Abstract

Link-cut trees have been introduced by D.D. Sleator and R.E. Tarjan (Journal of Computer and 
System Sciences, 1983) with the aim of efficiently maintaining a forest of vertex-disjoint dynamic rooted 
trees under cut and link operations. These operations respectively disconnect a subtree from a tree, and 
join two trees by an edge. Additionally, link-cut trees make it possible to change the root of a tree and to perform
a number of updates and queries on cost values defined on the arcs of the trees. All these operations are
performed in O(log n) amortized or worst-case time, depending on the implementation, where n is the
total size of the forest.

In this paper, we show that a list of elements implemented using link-cut trees (we call it a log-list)
allows us to obtain a common running time of O(log n) for the classical operations on lists, but also for
some other essential operations that usually take linear time on lists. Such operations find the
minimum/maximum element in a sublist defined by its endpoints, the position of a given element in the
list or the element placed at a given position in the list; or they add a value a to, or multiply
by -1, all the elements in a sublist.

Furthermore, we use log-lists to implement several existing algorithms for sorting permutations by 
transpositions and/or reversals and/or block-interchanges, and obtain O(n log n) running time for all
of them. In this way, the running time of several algorithms is improved, whereas in other cases our 
algorithms perform as well as the best existing implementations. 

Keywords: efficient data structure; double-linked list; link-cut tree; dynamic list ranking; permutation 
sorting 


1 Introduction 

Many data structures have been defined so far that allow one to store, modify and query a set of elements
(see El for a non-exhaustive introduction). Each of these data structures has specificities related to the 
type of the set to be stored (disjoint elements or not, ordered elements or not), but overall the operations 
to be performed as efficiently as possible are: insert an element, delete an element, find an element, find 
the maximum/minimum element, find the predecessor/successor of an element (if it is defined), modify 
(the identifying key of) an element. 

When the set is ordered (then we call it a list), a significant number of applications exist where several
consecutive elements must be inserted/deleted/modified simultaneously. The above-mentioned data
structures applied to the ordered case allow only a sequential treatment of each element. Only linked
lists make it possible to insert or delete consecutive elements in/from the sets they represent, but the
absence of sublinear worst-case time access to an element given by its position makes all the other
operations on linked lists too time-consuming.

In this paper, we use link-cut trees introduced in [12] to define a data structure, that we call a
log-list, on which a significant number of useful operations take O(log n) time. These operations include
the classical operations on lists (delete, insert a sublist), but also, for instance, simultaneously adding
a real value to the values of a sublist, finding the minimum value in a sublist, or finding the element at
the i-th position in the list (see the next section for more details). We subsequently use log-lists to
improve the running time of several existing algorithms for sorting a permutation by (selected types of)
transpositions and/or reversals and/or block-interchanges.

The paper is organized as follows. In Section 2 we present the operations to be performed on
log-lists and the implementation of log-lists as link-cut trees. In Section 3 we recall the general features of
link-cut trees, and propose additional features. In Section 4 we show that log-lists achieve O(log n)
running time for all the operations we propose for them, including when the elements have weights
and the operations are performed on the weights rather than on the elements. In Section 5 we give the
aforementioned applications to permutation sorting. Section 6 is the conclusion.


2 Log-lists 

Let $L = (x_1, x_2, \ldots, x_n)$ be an ordered set (or list) of not necessarily distinct elements with values from
a numerical set E. We wish to perform in O(log n) time (and less when possible) the operations below,
termed list-operations, on the list L. We assume each element is given by a pointer to it. An element is
therefore seen as a cell containing the value of the element. Thus having (a pointer to) an element and
getting the value of the element are two distinct requests. The list-operations are:

• first(L), which returns (a pointer to) the first element of L, or null if L is empty.

• last(L), which returns (a pointer to) the last element of L, or null if L is empty.

• get-value(list L, element x), which returns the value of the element x of L.

• succ(list L, element x), which returns the value of the element immediately following x in L, if
$x \neq last(L)$.

• prec(list L, element x), which returns the value of the element immediately preceding x in L, if
$x \neq first(L)$.

• insert(list L, list $L_1$, element x), which inserts a list $L_1$ in (non-empty) L immediately after the
element x (similarly: before x), and returns L.

• delete(list L, element x, element y), which deletes the sublist $L_1$ of L defined by its first element
x and its last element y, and returns L and $L_1$.

• reverse(list L, element x, element y), which reverses the order of the elements in the sublist of L
defined by its first element x and its last element y, and returns the new list L.

• find-min(list L, element x, element y), which returns an occurrence of the minimum value in the
sublist of L defined by its first element x and its last element y.

• find-max(list L, element x, element y), which is similar to find-min but returns an occurrence of the
maximum value.

• add(list L, element x, element y, real a), which adds a real value a to the value of each element in
the sublist of L defined by its first element x and its last element y, and returns the new list L.

• change-sign(list L, element x, element y), which multiplies by -1 the value of each element in
the sublist of L defined by its first element x and its last element y, and returns L.

• find-rank(list L, element x), which returns the value i such that x is the i-th element of the list L
(i.e. $x = x_i$).

• find-element(list L, integer i), which returns a pointer to the i-th element of the list L (i.e. to $x_i$).
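To fix the intended semantics of the sublist operations, the following is a minimal reference sketch on a plain Python list. It is not the log-list: every function here runs in linear time, and, as an assumption of this sketch only, elements are designated by 1-based positions instead of pointers. The function names mirror the list-operations above.

```python
# Reference semantics only: O(n) per call, positions are 1-based (sketch assumption).

def find_min(L, i, j):
    """Position of an occurrence of the minimum value in the sublist x_i..x_j."""
    return min(range(i, j + 1), key=lambda k: L[k - 1])

def add(L, i, j, a):
    """Add a to the value of every element of the sublist x_i..x_j."""
    for k in range(i - 1, j):
        L[k] += a
    return L

def change_sign(L, i, j):
    """Multiply by -1 the value of every element of the sublist x_i..x_j."""
    for k in range(i - 1, j):
        L[k] = -L[k]
    return L

def reverse(L, i, j):
    """Reverse the order of the elements of the sublist x_i..x_j."""
    L[i - 1:j] = L[i - 1:j][::-1]
    return L

def delete(L, i, j):
    """Remove the sublist x_i..x_j; return the remaining list and the removed sublist."""
    removed = L[i - 1:j]
    del L[i - 1:j]
    return L, removed

def insert(L, L1, i):
    """Insert the list L1 immediately after the element at position i."""
    L[i:i] = L1
    return L
```

A log-list is expected to return the same results as these functions, with each call costing O(log n) instead of O(n).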


When the elements in L have weights (including keys), similar operations may be performed in
O(log n) time on the weights; see Section 4.2.


In a classical double-linked list, the operations first, last, get-value, succ, prec, insert, delete and reverse
need O(1) time, whereas the other operations need O(n) time. Our aim is to balance the running
times of these operations, seeking O(log n) time or less for each of them. The powerful link-cut trees
developed by Sleator and Tarjan in [12] in order to deal with dynamic trees apply here in the particular
case where all trees are paths (as we show below), but need additional features that we provide in the
next section.

Figure 1: Log-list for the list L = (8, 5, -4, 6). a) Link-cut tree T(L) in standard form. b) Link-cut tree T(L) in
non-standard form. The arc labels are the pairs val | index.

We call support of a directed graph the undirected graph obtained by removing the orientations of
the arcs. Consider the data structure, named a log-list for L and denoted by H(L), made of:

• a rooted tree T(L) whose support is a path, built as follows (see Figure 1). Its vertex set is
$\{t_1, t_2, \ldots, t_{n+1}\}$ and its edges are given by the pairs $(t_i, t_{i+1})$, $1 \le i \le n$. In its standard
form, the tree T(L) is assumed to have the root $t_{n+1}$, with arcs going towards the root. For each
i, the arc $e_i$ with endpoints $t_i$ and $t_{i+1}$ has two costs, namely $val(e_i) := x_i$ and $index(e_i) := i$.
If the list is empty, then the tree is empty.

• three pointers head(L), tail(L) and tail+(L), which point respectively to the vertices $t_1$, $t_n$ and
$t_{n+1}$ of T(L). These pointers change only when the list L changes, but are not modified when
the root of the tree T(L) changes.
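The construction of T(L) in standard form can be sketched as follows. This is only an illustration of the definition: the class name `LogListTree`, the use of integers i for the vertices $t_i$, and the plain dictionaries are assumptions of the sketch, and no link-cut machinery is involved.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LogListTree:
    # parent[i] = i + 1 encodes the arc e_i = (t_i, t_{i+1}), directed towards the root
    parent: dict = field(default_factory=dict)
    val: dict = field(default_factory=dict)      # val(e_i), keyed by the source t_i
    index: dict = field(default_factory=dict)    # index(e_i), keyed by the source t_i
    head: Optional[int] = None                   # t_1
    tail: Optional[int] = None                   # t_n
    tail_plus: Optional[int] = None              # t_{n+1}, the root in standard form

def build(values):
    """Build T(L) and the three pointers in O(n) time; an empty list gives an empty tree."""
    T = LogListTree()
    n = len(values)
    if n == 0:
        return T
    for i, x in enumerate(values, start=1):
        T.parent[i] = i + 1
        T.val[i] = x
        T.index[i] = i
    T.parent[n + 1] = None
    T.head, T.tail, T.tail_plus = 1, n, n + 1
    return T
```

For the list (8, 5, -4, 6) of Figure 1, this yields five vertices and four arcs, with the element -4 stored as the cost of the arc leaving $t_3$.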

Obviously, given a list L we can build in O(n) time the tree T(L) and the three pointers head(L),
tail(L), tail+(L). The same data structure is used for all the lists and sublists handled during the
aforementioned operations. Therefore, at each operation, one or several trees must be handled, and they
undergo arc deletions and/or arc additions, that cut or link the existing trees. This forest of trees is thus
implemented using link-cut trees [12], and therefore each tree of it is assumed to be a link-cut tree (see
the next section for more details).

Remark 1. Notice that, in T(L), the elements of the list are stored as costs on the arcs. The vertices
in T(L) are not identified with the elements in L. Therefore, when a pointer to an element $x_i$ of L is
given, we assume this means we are provided with a pointer to the source $t(x_i)$ of the arc with cost $x_i$
in the standard form of T(L). See Figure 1.

Remark 2. Also note that other aggregate operations may be performed similarly to add, since link-cut
trees support them, as observed in [12]. See Section 3 for details about the data structure used by link-cut
trees.
3 Link-cut trees


3.1 General features 

A link-cut tree is a rooted tree, whose arcs are supposed to be directed towards the root, so that if (v, w)
is an arc then w is the parent of v. The following dynamic tree operations may be performed on link-cut
trees, in any order (where cost is a cost with real values, on the arcs of the link-cut trees in the forest).
Each operation takes O(log n) time [12].

1. dparent(vertex v), which returns the parent of v in the tree containing it, or null if v is the root.

2. droot(vertex v), which returns the root of the tree containing v.

3. dcost(vertex v), which returns cost(v, dparent(v)), provided v is not the root of the tree containing
it.

4. dmincost(vertex v), which returns the vertex w closest to droot(v) such that cost(w, dparent(w))
is minimum among the values cost(w', dparent(w')) for all vertices w' on the path from v to droot(v).
Again, it is assumed that $v \neq droot(v)$.

5. dupdate(vertex v, real a), which adds a to the cost of all edges on the path from v to droot(v).

6. dlink(vertex v, vertex w, real a), which assumes that $v = droot(v) \neq droot(w)$ and adds an arc
(v, w) with cost a, thus combining the trees containing v and w.

7. dcut(vertex v), which assumes that $v \neq droot(v)$ and cuts the arc between v and dparent(v), thus
dividing the tree initially containing v into two trees.

8. devert(vertex v), which reverses the direction of all arcs on the path from v to droot(v), thus making
v the root of the tree.
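As a reading aid for the eight operations, here is a naive parent-pointer forest that implements their specification directly. It is a sketch for checking semantics, not the structure of [12]: each call costs time proportional to the depth of v rather than O(log n), and the class name `NaiveForest` is an assumption.

```python
class NaiveForest:
    """Parent-pointer forest; cost[v] is the cost of the arc (v, parent[v])."""

    def __init__(self, vertices):
        self.parent = {v: None for v in vertices}
        self.cost = {}

    def _path(self, v):
        # Vertices from v up to the root of its tree, in this order.
        path = [v]
        while self.parent[path[-1]] is not None:
            path.append(self.parent[path[-1]])
        return path

    def dparent(self, v):
        return self.parent[v]

    def droot(self, v):
        return self._path(v)[-1]

    def dcost(self, v):
        return self.cost[v]

    def dmincost(self, v):
        best = None
        for w in self._path(v)[:-1]:
            # "<=" keeps, among ties, the vertex closest to the root
            if best is None or self.cost[w] <= self.cost[best]:
                best = w
        return best

    def dupdate(self, v, a):
        for w in self._path(v)[:-1]:
            self.cost[w] += a

    def dlink(self, v, w, a):
        assert self.parent[v] is None and self.droot(w) != v
        self.parent[v] = w
        self.cost[v] = a

    def dcut(self, v):
        assert self.parent[v] is not None
        self.parent[v] = None
        del self.cost[v]

    def devert(self, v):
        path = self._path(v)
        costs = [self.cost.pop(w) for w in path[:-1]]
        self.parent[v] = None
        # Each arc (p_i, p_{i+1}) becomes (p_{i+1}, p_i) and keeps its cost.
        for child, par, c in zip(path[1:], path[:-1], costs):
            self.parent[child] = par
            self.cost[child] = c
```

Note that devert moves each cost together with its arc, so after the call the cost of a reversed arc is stored at its new source vertex.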

Remark 3. Given that the operations dparent, droot, dcost, dmincost and dupdate are static (they do not 
change the forest of link-cut trees), several costs may be simultaneously defined and used in an arbitrary 
order. Naturally, dcut and dlink have a number of cost parameters equal to the number of cost functions 
defined on the arcs of the tree. 

Remark 4. Also note that dcost(v) takes O(log n) time, and not O(1) time as we could expect. This is
due to the fact that the cost values are not directly stored, but computed using additional information, in
order to allow simultaneous modifications using dupdate (see Section 3.2).

To achieve O(log n) running time for all these operations, in [12] the arc set of each link-cut tree
in the forest is partitioned into solid arcs and dashed arcs. Each vertex has at most one ingoing solid
arc and, since it has at most one parent, at most one outgoing solid arc. Thus the solid paths, which are
all the maximal paths formed by solid arcs, partition the vertex set of the link-cut tree (assuming that a
vertex belonging to no solid arc defines alone a trivial solid path). Each solid path is then represented
as a binary tree of height O(log n) whose internal nodes represent the arcs of the solid path (with their
costs) and whose leaves represent the vertices of the path, in such a way that a symmetric traversal of the
binary tree results in a "spelling" of the solid path from its head to its tail, including both its vertices
and its arcs. The binary trees of all solid paths of a link-cut tree, which also contain a lot of additional
information not described here, are then connected in order to depict the structure of the link-cut tree.
Note that, following [12], we use the term vertex for the link-cut trees, and the term node for the binary
trees.

This structure (see Figure 2 for a summary) has the twofold advantage of being highly parameterizable
(the type of the binary tree, the definition of solid and dashed arcs) and of being able to reduce
operations on link-cut trees to operations on paths, the latter being themselves reduced to operations
on binary trees. Then, when a dynamic operation on a tree has to be performed: either it is a basic
operation that may be performed by querying the binary trees representing the tree without modifying
them, and thus without modifying the solid paths (this is the case of dparent and dcost); or it is a complex
operation requiring first that a solid path be built from the vertex v to the root of its tree (this is the case
of droot, dmincost, dupdate, dcut, devert, with an exception for dlink where the path starts in w instead of v).
Building the solid path from v to the root of the tree, and thus the binary tree associated with it, is done
by an operation called adexpose(v), where a stands for auxiliary. Once adexpose is performed, finishing
the treatment required by droot, dmincost, dupdate and devert needs only to move inside the binary tree,
querying it or modifying values. However, the two remaining operations, dlink and dcut, need
respectively to combine the binary tree obtained by adexpose with another one, and to cut it into two
trees. Overall, the topological modifications of binary trees are due to adexpose, dlink and dcut, and are
implemented using the four auxiliary operations below:

• adconstruct(node r, node s, real x), which combines two binary trees with roots r and s into another
binary tree whose root node has cost x, left child r and right child s.

• addestroy(node r), which splits the binary tree with root r into the two subtrees with roots given by
its left and right child, and returns the two subtrees as well as the cost at node r before splitting.

• adrotateleft(node r), which assumes that r has a right child c and performs a left rotation on r, i.e.
r becomes the left child of c, whose left child becomes the right child of r. The operation returns
the new root of the binary tree.

• adrotateright(node r), which is similar to adrotateleft(node r), with left and right sides exchanged.

The nodes of the binary tree store considerable information, which makes it possible to perform these
four auxiliary operations in constant time, independently of the type of the binary tree. However, in order
to perform the other dynamic tree operations in O(log n) time (including adexpose), the type of the binary
tree must be carefully chosen. With locally biased binary trees, amortized O(log n) running time is achieved.
With globally biased binary trees, worst-case O(log n) running time is achieved if in addition the solid
arcs are specifically defined as being the heavy arcs of the link-cut tree. An arc (v, w) of a link-cut
tree is heavy if $2 \cdot size(v) > size(w)$, where size(v) denotes the number of vertices in the subtree
of v, including v itself. The operations defined above remain valid, with the only difference that when an
operation modifying the set of solid paths of a link-cut tree is performed (i.e. droot, dmincost, dupdate,
devert, dlink and dcut, which use adexpose), then it must be followed by a corrective procedure called
adconceal that transforms the (possibly temporarily non-heavy) solid paths into heavy paths. The efficient
implementation of adconceal requires augmenting the data structure again, with data whose update does not
modify the running times of the other operations.
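The heavy-arc condition can be checked directly from subtree sizes. The sketch below applies the definition $2 \cdot size(v) > size(w)$ to a forest given by parent pointers; the function name `heavy_arcs` and the dictionary encoding are assumptions, and the code recomputes everything from scratch instead of maintaining sizes dynamically.

```python
def heavy_arcs(parent):
    """Return the set of heavy arcs (v, w) of a rooted forest.

    parent maps each vertex to its parent, or to None for a root.
    An arc (v, w) is heavy if 2 * size(v) > size(w).
    """
    def depth(v):
        d = 0
        while parent[v] is not None:
            v = parent[v]
            d += 1
        return d

    size = {v: 1 for v in parent}
    # Process the deepest vertices first so that size[v] is final
    # before it is added to the size of its parent.
    for v in sorted(parent, key=depth, reverse=True):
        if parent[v] is not None:
            size[parent[v]] += size[v]
    return {(v, w) for v, w in parent.items()
            if w is not None and 2 * size[v] > size[w]}
```

Since the subtrees of two children of w are disjoint and both lie inside the subtree of w, at most one arc entering w can satisfy the inequality, which is what allows heavy arcs to play the role of solid arcs.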

Remark 5. In our description, we assume link-cut trees use heavy paths and (locally or globally) biased
binary trees, in order to achieve the O(log n) amortized or worst-case running time. However, we do
not have to go into these details to explain the additional features we add to the standard link-cut tree
data structure, so we only use the terms solid paths and binary trees in our description.


3.2 Additional features 

In this section, we propose several modifications of the data structure presented above, in order to allow 
the following additional operations on a link-cut tree: 

9. dsearchcost(vertex v, real a), which searches for a vertex w on the path from v to droot(v) such that
cost(w, dparent(w)) = a, assuming the costs are strictly increasing as we go up from v to droot(v).
The operation returns w, if it exists, or the vertex w' with the largest value cost(w', dparent(w'))
smaller than a, if such a w' exists. Otherwise, it returns droot(v).

10. dminuscost(vertex v), which multiplies by -1 all the costs on the path from v to droot(v).

As usual, both these operations start with a call to adexpose(v), which builds the solid path from
v to droot(v) and the binary tree $B_T$ associated with it. In $B_T$, as described in [12], each internal
Configuration   Representation                Operation            Running time   Source
--------------  ----------------------------  -------------------  -------------  ---------
Link-cut tree   Set of solid paths            dparent(v)           O(log n)       [12]
                                              droot(v)             O(log n)       [12]
                                              dcost(v)             O(log n)       [12]
                                              dmincost(v)          O(log n)       [12]
                                              dupdate(v, a)        O(log n)       [12]
                                              dlink(v, w, a)       O(log n)       [12]
                                              dcut(v)              O(log n)       [12]
                                              devert(v)            O(log n)       [12]
                                              dsearchcost(v, a)    O(log n)       Sect. 3.2
                                              dminuscost(v)        O(log n)       Sect. 3.2
                                              adexpose(v)          O(log n)       [12]
                                              adconceal(p)         O(log n)       [12]
Solid path      Binary tree                   path operations
(abstract)                                    (not needed here)
Binary tree     Locally biased binary tree    adconstruct(r, s, x) O(1)           [12]
(abstract)      Globally biased binary tree   addestroy(r)         O(1)           [12]
                                              adrotateleft(r)      O(1)           [12]
                                              adrotateright(r)     O(1)           [12]

Figure 2: Link-cut trees and their representation levels. Only the operations we refer to in this paper are recorded.
The O(log n) running time is the amortized running time for locally biased binary trees, and the worst-case running
time for globally biased binary trees.


node e (recall it is an arc of T) stores, among other information, pointers bparent(e) to the parent
of e in $B_T$ and bleft(e), bright(e) to respectively the left and right child of e in $B_T$. Moreover (see
Figure 3), it stores two values named netcost(e) and netmin(e), which are related to cost(e) and
$mintree(e) := \min\{cost(f) \mid f \text{ belongs to the subtree rooted at } e\}$ by the following equations [12]:

$$netcost(e) = cost(e) - mintree(e)$$

$$netmin(e) = \begin{cases} mintree(e) & \text{if } e \text{ is the root of } B_T \\ mintree(e) - mintree(bparent(e)) & \text{otherwise} \end{cases}$$

Then mintree(e) is equal to the sum of the netmin values on the path in $B_T$ from e (included) to the
root of $B_T$ (included), and cost(e) is the sum of mintree(e) and netcost(e). Therefore mintree(e)
and cost(e) are not stored in the tree, but only computed when needed. The values netcost(e) and
netmin(e) are initialized when the forest of link-cut trees is initialized, in linear time; they are further
updated in O(1) time when the link-cut trees and/or their solid paths are modified, by the operations handling
these modifications, namely adconstruct, addestroy, adrotateleft and adrotateright. Note that updating the
value of netmin at the root of $B_T$ results in an update of mintree(e) and cost(e) in the entire tree,
in O(1) time.

Example. The value mintree(c, d), which is 5, is computed as netmin(c, d) + netmin(d, h) +
netmin(b, c) = 3 + 0 + 2 = 5. The value cost(c, d), which is also 5, is computed as netcost(c, d) +
mintree(c, d) = 0 + 5 = 5.
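The netcost/netmin encoding can be reproduced on the solid path abcdh as follows. This is a sketch under stated assumptions: leaves (the vertices of the path) are omitted, the costs 7, 5 and 2 of the arcs (a, b), (c, d) and (d, h) are those appearing in the paper's examples, whereas cost(b, c) = 4 is an assumed value chosen to be compatible with them. The initialisation below is deliberately simple and is not the linear-time initialisation mentioned in the text.

```python
class Node:
    """Internal node of the binary tree, i.e. an arc of the solid path."""

    def __init__(self, name, cost, left=None, right=None):
        self.name, self.left, self.right, self.bparent = name, left, right, None
        for child in (left, right):
            if child is not None:
                child.bparent = self
        self._cost = cost              # read only by the initialisation
        self.netcost = self.netmin = None

def _subtree_min(e):
    kids = [c for c in (e.left, e.right) if c is not None]
    return min([e._cost] + [_subtree_min(c) for c in kids])

def initialise(e):
    """Fill in netcost and netmin from the defining equations."""
    m = _subtree_min(e)
    e.netcost = e._cost - m
    e.netmin = m if e.bparent is None else m - _subtree_min(e.bparent)
    for c in (e.left, e.right):
        if c is not None:
            initialise(c)

def mintree(e):
    """Sum of the netmin values from e up to the root (both included)."""
    total = 0
    while e is not None:
        total += e.netmin
        e = e.bparent
    return total

def cost(e):
    return e.netcost + mintree(e)

# Symmetric order of the arcs: (a,b), (b,c), (c,d), (d,h); root (b,c).
cd = Node("cd", 5)
dh = Node("dh", 2, left=cd)
ab = Node("ab", 7)
bc = Node("bc", 4, left=ab, right=dh)   # cost(b, c) = 4 is an assumption
initialise(bc)
```

With these values, netmin is 3, 0 and 2 at (c, d), (d, h) and (b, c), as in the example above, and adding a constant to `bc.netmin` shifts every cost in the tree in O(1) time, which is how dupdate acts once the solid path has been built.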

We are now ready to prove the following claim. 

Claim 1. Let T be a link-cut tree, and v one of its vertices. Assume the values of the function cost are
strictly increasing when going from v to droot(v). Then dsearchcost(vertex v, real a) may be implemented
in O(log n) time.
Figure 3: Binary tree for a given solid path, showing only the essential information related to costs (in particular,
only the bparent pointers are shown, whereas pointers from the parent to its children also exist). a) The solid path
abcdh and the costs on its edges. b) The information stored in each node (cells drawn with plain lines), and the
information not stored, but computed with the help of stored information (cells drawn with dotted lines). c) The
tree built for the solid path in a).


Proof. Perform adexpose(v) and let $v_1, v_2, \ldots, v_p$, with $v_1 = v$, $v_p = droot(v)$ and $p \ge 2$, be the
vertices on the solid path going from v to droot(v), in this order. Then, by hypothesis, for each arc
$(v_i, v_{i+1})$, $1 \le i \le p - 2$, we have $cost(v_i, v_{i+1}) < cost(v_{i+1}, v_{i+2})$.

Let $B_T$ be the binary tree associated with the solid path, and recall that a symmetric traversal of $B_T$
allows us to "spell" the solid path from its head to its tail, including both its vertices (which are the
leaves of $B_T$) and its arcs (which are the internal nodes of $B_T$). Then, when the costs of the arcs are
listed by the same symmetric traversal, they are in strictly increasing order, meaning that $B_T$ is a binary
search tree. We deduce that a classical search for a in $B_T$ allows us to find the arc e with cost a (if any),
and thus the sought vertex $v_i$ (which is the rightmost leaf in the left subtree of e). If such an arc e is
not found, then again a classical search allows us to find the arc with the largest cost lower than a, and the
sought vertex $v_j$. This search takes time proportional to the height of $B_T$, that is O(log n).

However, this approach assumes the cost of each arc in T is known. Unfortunately, computing
cost(v, dparent(v)) (which is dcost(v)) takes O(log n) time, as indicated in Remark 4, and thus computing
all the costs of the arcs would take O(n log n) time. We therefore need to go deeper into the
representation of the costs in the binary tree $B_T$, in order to reduce the running time to O(log n). Recall
that in a binary search tree we only need to compare the value we are looking for (here, a) with the
values belonging to a unique branch of the tree, meaning that we only have to compute the costs of the
arcs of T encountered in $B_T$ during this branch traversal. If we show that all these O(log n) costs are
computed in O(log n) time, then we are done.

Now, it is easy to see that the cost values of the arcs e of T encountered in $B_T$ during the search
of a may be computed in O(1) time each, when we go down this branch. For this, it is sufficient to
notice that mintree(e) = mintree(bparent(e)) + netmin(e), except for the root, and that cost(e) is
computed in O(1) time using mintree and netcost. Then, all the cost values on the traversed branch
of $B_T$ are computed in O(log n) time. □
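The top-down search used in the proof of Claim 1 can be sketched as follows. The sketch makes several assumptions: the solid path has already been exposed, leaves are omitted and each internal node records directly the source vertex of its arc (instead of finding it as the rightmost leaf of the left subtree), and the stored netcost and netmin values are given by hand for a path $v_1, \ldots, v_5$ with arc costs 1, 3, 4, 8.

```python
class Node:
    def __init__(self, tail, netcost, netmin, left=None, right=None):
        self.tail = tail                      # source vertex of the arc
        self.netcost, self.netmin = netcost, netmin
        self.left, self.right = left, right

def dsearchcost(root, a, path_root):
    """Search the exposed path for the arc with cost a.

    Returns the source of the arc with cost a, else the source of the arc
    with the largest cost smaller than a, else path_root (the root vertex).
    """
    best = None
    e, mintree_parent = root, 0
    while e is not None:
        mintree_e = mintree_parent + e.netmin   # O(1) per level
        cost_e = mintree_e + e.netcost
        if cost_e == a:
            return e.tail
        if cost_e < a:
            best = e.tail                     # last node where the search goes right
            nxt = e.right
        else:
            nxt = e.left
        e, mintree_parent = nxt, mintree_e
    return best if best is not None else path_root

# Arc costs in symmetric order: (v1,v2)=1, (v2,v3)=3, (v3,v4)=4, (v4,v5)=8.
n12 = Node("v1", netcost=0, netmin=0)
n34 = Node("v3", netcost=0, netmin=0)
n45 = Node("v4", netcost=4, netmin=3, left=n34)
n23 = Node("v2", netcost=2, netmin=1, left=n12, right=n45)
```

Each level of the descent reuses the mintree value of the parent, so the whole search performs a constant amount of work per node of a single branch.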

We focus now on the second operation we wish to add, dminuscost. Again, perform adexpose(v) and
build the binary tree $B_T$ corresponding to the solid path from v to droot(v). Notice that dupdate(v, x)
updates the costs of all the nodes in $B_T$ (and thus of all the arcs on the solid path) in O(1) time by
adding x to netmin(r), where r is the root node of $B_T$. Several dupdate operations may be performed
consecutively, and each of them has an immediate effect on netmin(r), implying that we do not have
to store the real values involved in each such operation and, moreover, that we may perform up-down
computations by accumulating netmin values as in the proof of Claim 1. On the contrary, if one wants
to introduce a multiplicative type of update, one has to store the multiplicative value y, since its effect
cannot be reduced to a multiplication of netmin(r) by y. If several successive updates hold, both
additive and multiplicative, then all the real values involved in these updates must be stored. Moreover,
the up-down computations become inefficient, since at each level one has to compute all the stored
updates.

Therefore, dminuscost is limited to multiplications by -1. In this case, we are able to ensure an
immediate effect on the root of the tree.
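A small numerical check illustrates why an additive update can be absorbed by netmin at the root while a sign change cannot. The costs are those of the path abcdh used in the examples, with cost(b, c) = 4 as an assumed value; the helper name `mintree_root` is also an assumption.

```python
costs = {"ab": 7, "bc": 4, "cd": 5, "dh": 2}   # cost(b, c) = 4 is an assumed value

def mintree_root(c):
    """mintree at the root, i.e. the minimum cost over the whole binary tree."""
    return min(c.values())

shifted = {e: x + 3 for e, x in costs.items()}
negated = {e: -x for e, x in costs.items()}

# Adding 3 to every cost adds 3 to the minimum: one change at the root suffices.
additive_ok = mintree_root(shifted) == mintree_root(costs) + 3

# Negating every cost does not negate the minimum: the new minimum is the
# negated former maximum, which netmin(r) alone does not record.
negation_ok = mintree_root(negated) == -mintree_root(costs)
```

Here `additive_ok` is true and `negation_ok` is false, which is the reason a second, independent series of net values for the negated costs is needed.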

Claim 2. The link-cut tree data structure may be modified such that, additionally to the other dynamic
tree operations, dminuscost(vertex v) takes O(log n) time, for each vertex v. The space requirements are
still in O(n).

Proof. As dupdate does, dminuscost calls adexpose in order to compute the path from v to droot(v) and
its associated binary tree. We modify the structure of the binary trees in order to enable efficient sign
changes.

Consider the initial state of the forest of link-cut trees, in which the solid paths of each tree have
been defined, and the binary trees to store them are about to be initialized. The idea of the proof is to
store redundant information in each node e of each binary tree, so that each of cost(e) and -cost(e)
may be computed using its own series of netmin values. The series computing cost(e) is the positive
series, whereas the series computing -cost(e) is the negative series. They are disjoint, and the type
(positive or negative) of each series is stored in the root r of the tree. A multiplication by -1 of all the
costs in $B_T$ then only requires exchanging the positive and negative series.

Formally, we define each node e in the binary tree to have three parts: one of them, denoted $e^0$,
contains the usual information stored in the node according to [12], except netcost and netmin; another
one, denoted $e^1$, contains two real variables $netcost(e^1)$ and $netmin(e^1)$ and a pointer $up(e^1)$; the
third one, denoted $e^2$, contains two real variables $netcost(e^2)$ and $netmin(e^2)$, and a pointer $up(e^2)$.
It is assumed that $e^0$, $e^1$ and $e^2$ may be pointed to separately. Pointers $up(e^1)$ and $up(e^2)$ point to
$bparent(e)^1$ and to $bparent(e)^2$ respectively, or vice-versa, if e is not the root. If e is the root, then one
of them points to its own source node (forming a loop) and the other one is null, according to rules that
will be presented below.

Then the (initial) binary tree, whose root is denoted r, may be seen as composed of three binary
trees (see Figure 4a): the basic one given by the 0-parts of the nodes, and the arcs (e, bparent(e)); the
1-tree given by $r^1$, and the arcs $(e^i, up(e^i))$ such that $up(e^i) = r^1$ or there is a path from $up(e^i)$ to $r^1$;
and the 2-tree given by $r^2$, and the arcs $(e^i, up(e^i))$ such that $up(e^i) = r^2$ or there is a path from $up(e^i)$
to $r^2$. These three trees are vertex- and arc-disjoint. The 1-tree and the 2-tree are, in some way, dual to
each other, since one of them computes and updates cost(e), for all e, whereas the other one computes
and updates -cost(e), for all e. It is understood that the one that computes cost(e), that we call the
positive tree, is the one whose up pointer at the root forms a loop (see above). The other one is then the
negative tree. Its up pointer at the root is null.

Example. In Figure 4a, the three trees are identified by their root (part $r^0$, $r^1$ or $r^2$ of the global
root r) and by the arcs forming paths joining this root. Basically, the 1-tree uses the nodes and arcs
on the left (double gray), the 0-tree the nodes and the arcs in the middle (black), and the 2-tree the
nodes and the arcs on the right (simple gray) of the global tree. (This left-middle-right partition changes
when multiplications by -1 and topological changes occur, see below.) The 1-tree is the same as in
Figure 3. The 2-tree has different values, but it allows us to compute, in the same way as the 1-tree, the
value -cost(e) for each edge e in the solid path. For instance, recall that we computed cost(c, d) as
netcost(c, d) + mintree(c, d) = 0 + (3 + 0 + 2) = 5. Following the similar path from the node $(c, d)^2$
(i.e. the part 2 of the edge (c, d)) up to the root $r^2$ ($= (b, c)^2$) we compute the value $netcost((c, d)^2) +
mintree((c, d)^2) = 0 + (0 + 2 + (-7)) = -5$, which is exactly -cost(c, d).
[Figure 4, panel a): the 1-tree is the positive tree and the 2-tree is the negative tree. Panel b): the 1-tree is the
negative tree and the 2-tree is the positive tree.]

Figure 4: a) The binary tree for the solid path abcdh in Figure 3, with parts 0, 1 and 2 that are pointed to
independently using up pointers (black, double gray and gray arrows respectively). b) The same tree after the
multiplication of all values by -1 (the up pointers of the root have changed) and, subsequently, the addition of the
value 3 to all values in the tree (+3 has been added to the netmin value at the root $r^2$ of the positive tree, and has
been subtracted from the netmin value at the root $r^1$ of the negative tree). The resulting new values of cost and
mintree are indicated for information, but they are not stored (and therefore not computed at this level).


The binary trees are initialized simultaneously for the entire forest in its initial state (whatever this
state), as indicated below. Then, they are modified through the operations dupdate, dminuscost, as well as
by the operations adconstruct, addestroy, adrotateleft and adrotateright we mentioned in Section 3.1, which
must stay within O(1) running time. Denote:

$$mintree(e^i) := \min\{cost(f^i) \mid f^i \text{ belongs to the subtree rooted at } e^i\}, \quad i = 1, 2$$

Then, throughout the modifications above, each binary tree is characterized by the following features:


A. Among the 1-tree and the 2-tree, the one with non-null up pointer at its root is the positive tree, i.e.
it computes cost(e) for all nodes e in the binary tree; the other one, the negative tree, computes
-cost(e) for all nodes e in the binary tree.

B. The following equations hold (i = 1, 2):

$$cost(e^1) = -cost(e^2), \qquad |cost(e^1)| = |cost(e^2)| = |cost(e)|$$

$$netcost(e^i) = cost(e^i) - mintree(e^i)$$

$$netmin(e^i) = \begin{cases} mintree(e^i) & \text{if } e \text{ is the root of the binary tree} \\ mintree(e^i) - mintree(up(e^i)) & \text{otherwise} \end{cases}$$

C. Therefore we have:

$$mintree(e^i) = \sum \{ netmin(f) \mid f \text{ belongs to the path from } e^i \text{ to the root} \}$$

$$cost(e^i) = netcost(e^i) + mintree(e^i).$$

In other words, each of the positive and negative trees has the same properties with respect to the
costs as the binary tree used in [12]. The need to have two such trees comes from the need to handle both
the cost and its negation, which is made possible by the double choice for the up values.
In the following, we present the initialization step and the aforementioned operations. Note that features A,
B and C above are satisfied by the initialization step.

Initialization step. See Figure 4a. Let B be a binary tree with root r, corresponding to a solid
path of a link-cut tree T. The binary tree B is built such that $up(e^i) = bparent(e)^i$ for i = 1, 2, and
$up(r^1) = r^1$, $up(r^2) = null$.

For each node e of B, define:

$$cost(e^1) = cost(e); \qquad cost(e^2) = -cost(e)$$

$$mintree(e^i) := \min\{cost(f^i) \mid f^i \text{ belongs to the subtree rooted at } e^i\}, \quad i = 1, 2.$$

As in the initial structure in [12], none of these values is stored in B. Only the values $netcost(e^i)$
and $netmin(e^i)$ are stored, and they allow us to compute them. These values are defined similarly to [12]:

$$netcost(e^i) = cost(e^i) - mintree(e^i), \quad i = 1, 2$$

$$netmin(e^i) = \begin{cases} mintree(e^i) & \text{if } e \text{ is the root of the binary tree} \\ mintree(e^i) - mintree(bparent(e)^i) & \text{otherwise.} \end{cases}$$

Example. In Figure 4, the 1-tree (double gray arrows) contains exactly the same information as the
unique tree in Figure 3, by the definitions above. It therefore computes the cost values. The 2-tree
(simple gray arrows) similarly computes the -cost values, as it uses the same definitions, but for the
values -cost(e) instead of cost(e). For instance, let us see how the values netcost and netmin of the
node (d, h)^2 are computed. The minimum value in the subtree with root (d, h)^2 of the 2-tree is -5 (given
by cost((c, d)^2), which is -cost(c, d)) and thus mintree((d, h)^2) = -5. Similarly, the minimum value
in the subtree with root (b, c)^2 of the 2-tree is -7 (given by cost((a, b)^2), which is -cost(a, b)) and thus
mintree((b, c)^2) = -7. As cost((d, h)^2) = -2, we obtain netcost((d, h)^2) = cost((d, h)^2) -
mintree((d, h)^2) = -2 - (-5) = 3 and netmin((d, h)^2) = mintree((d, h)^2) - mintree((b, c)^2) =
-5 - (-7) = 2.

Then we have:

    mintree(e^i) = \sum_{f^i on the path from e^i to the root} netmin(f^i)
    cost(e^i) = netcost(e^i) + mintree(e^i).

Consequently, the 1-tree computes the values cost(e) for all nodes e of B, whereas the 2-tree
computes the costs -cost(e). The pointers up(r^1) and up(r^2) are correctly initialized.
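
The encoding above can be illustrated by a minimal Python sketch. It is not the implementation of the paper: the names Node, nodes, encode and decode_cost are ours, the shape of the four-node tree is an assumption chosen to agree with the values quoted in the example, and the cost 6 given to (b, c) is hypothetical (only the costs 7, 5 and 2 come from the example). The sketch stores only netcost and netmin, then recovers each cost by summing netmin along the path to the root.

```python
class Node:
    """A node e of the binary tree B; `cost` is the plain value cost(e)."""
    def __init__(self, name, cost, left=None, right=None):
        self.name, self.cost = name, cost
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self

def nodes(root):
    """All nodes of the subtree rooted at `root`, in symmetric order."""
    if root is None:
        return []
    return nodes(root.left) + [root] + nodes(root.right)

def encode(root, sign):
    """Return {node: (netcost, netmin)} for the tree whose costs are sign * cost(e)."""
    mintree = {e: min(sign * f.cost for f in nodes(e)) for e in nodes(root)}
    stored = {}
    for e in nodes(root):
        netcost = sign * e.cost - mintree[e]
        above = mintree[e.parent] if e.parent is not None else 0
        stored[e] = (netcost, mintree[e] - above)
    return stored

def decode_cost(e, stored):
    """Recover the cost of e from the stored values only."""
    mintree, f = 0, e
    while f is not None:          # sum of netmin on the path from e to the root
        mintree += stored[f][1]
        f = f.parent
    return stored[e][0] + mintree

# Hypothetical tree: (b,c) is the root, with children (a,b) and (d,h);
# (c,d) is the left child of (d,h).
ab = Node("(a,b)", 7)
cd = Node("(c,d)", 5)
dh = Node("(d,h)", 2, left=cd)
bc = Node("(b,c)", 6, left=ab, right=dh)   # the cost 6 is an assumption

one_tree = encode(bc, +1)   # computes cost(e)
two_tree = encode(bc, -1)   # computes -cost(e)
```

Under these assumptions, two_tree[dh] holds the pair (netcost, netmin) = (3, 2) computed in the example.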

Modifying the binary tree. See Figure 4. The binary tree, as defined above, is topologically
modified by several dynamic tree operations, which call the four auxiliary operations adconstruct, addestroy,
adrotateleft and adrotateright. In our version of the binary tree, each of them is implemented separately
for the positive and the negative trees, using the same methods as in [12]. Thus the operation
adconstruct(r, s, x) applies the (simple) adconstruct operation of [12] once for the two positive trees
with the cost value x for u^1, and once for the two negative trees with the cost value -x for u^2. The
operations addestroy(r), adrotateleft(r) and adrotateright(r) likewise apply the same (simple) operations
of [12] twice. These operations therefore have the same running time as those in [12], that is O(1) time, and the
O(log n) running time of the dynamic operations using them follows as in [12].

In addition to the topological changes, the operations dupdate and dminuscost modify some values
of the binary tree. In our version of the binary tree, dupdate(v, x) first performs adexpose(v) and, in the
binary tree B associated with the solid path from v to droot(v), adds x to netmin(r^i) and subtracts x
from netmin(r^{3-i}), where r^i is the root of the positive tree of B. Operation dminuscost(v) first performs
adexpose(v) and, in the binary tree B associated with the solid path from v to droot(v), exchanges the
positive and the negative trees by appropriately modifying the up pointers of their roots.

Clearly, our version of the binary tree only duplicates the operations of the original version, and
all the running times are unchanged. □
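
The effect of dupdate and dminuscost on the two roots can be sketched with a deliberately simplified model. This is not a link-cut tree: each of the two trees is reduced to one offset (playing the role of netmin at its root) plus values relative to that offset, and the class name DualTree and its methods are hypothetical. The point illustrated is only that both operations touch a constant number of root values.

```python
class DualTree:
    """Toy model of the (positive tree, negative tree) pair of one exposed solid path.

    Each tree is a list [root_offset, relative_values] such that
    cost_i = root_offset + relative_values[i].
    """
    def __init__(self, costs):
        lo, hi = min(costs), max(costs)
        self.positive = [lo, [c - lo for c in costs]]     # computes cost
        self.negative = [-hi, [hi - c for c in costs]]    # computes -cost

    def costs(self):
        offset, rel = self.positive
        return [offset + r for r in rel]

    def update(self, x):
        """Analogue of dupdate on the whole path: only the two root offsets change."""
        self.positive[0] += x
        self.negative[0] -= x

    def minus(self):
        """Analogue of dminuscost on the whole path: the two trees swap roles."""
        self.positive, self.negative = self.negative, self.positive

    def min_cost(self):
        # relative values are >= 0 and at least one of them is 0
        return self.positive[0]

path = DualTree([7, 6, 5, 2])
path.update(3)
after_update = path.costs()    # every cost shifted by 3
path.minus()
after_minus = path.costs()     # every cost negated
```

Keeping the negative tree up to date during update is what makes the later swap in minus correct without visiting any non-root value.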
Algorithm 1: insert(L, L_1, x)

    H(L) ← log-list representing L;            // assumes tail+(L) is the root of T(L)
    H(L_1) ← log-list representing L_1;        // assumes tail+(L_1) is the root of T(L_1)
    n_0 ← dindex(t(x));                        // number of elements in L before x and including x
    n_1 ← dindex(tail(L_1));                   // number of elements in L_1
    z ← t(succ(x));
    H(L) ← insert_topo(L, L_1, x);             // assumes pointers to t(x) and z did not change
    devert(z); dupdate(dparent(t(x)), n_0);    // updates the indices of the elements coming from L_1
    devert(tail+(L)); dupdate(z, n_1);         // updates the indices of the elements between z and tail+(L)
    return H(L);


4 List-operations on log-lists 

Now, the tree T(L) of any log-list L is seen as a link-cut tree, whose support is a path (see Section [5Ji 
and which has two cost operations, val and index, defined on each of its arcs. The underlying structures
(i.e. the binary trees associated with its solid paths) are built in O(n) time when the log-list is initialized.

4.1 Main result 

We show that: 

Theorem 1. In a log-list, all the list-operations take O(log n) time, except for first and last, which take
O(1) time.

Proof. We assume that, before each operation, T(L) is in standard form (otherwise, we apply
devert(tail+(L))). We only have to show how dynamic tree operations allow us to perform list-operations.
Note that L and L_1 are both represented as log-lists with cost values val and index, and therefore
support the dynamic tree operations. Recall that when a pointer to an element x of the list is given, this
means that a pointer is given to the vertex t(x) of T(L) which is the source of the arc recording x in the
standard form of T(L) (according to Remark 1).

Operations first(L) and last(L) need only return head(L) and tail(L) respectively, and take
constant time.

Operation get-value(L, x) may be realized by a simple call to dval(t(x)), which is a call to dcost when
the cost function is val.

Operations succ(L, x) and prec(L, x) are easy to implement. For succ(L, x), since T(L) is in
standard form, let y be dparent(t(x)). Then the value returned by dval(y) is the successor of x in L, where
dval is the variant of the function dcost when the cost is given by the function val. For prec(L, x), perform
devert(head(L)) (which does not change the pointer t(x)) and return the value dval(z), where z = t(x).
Note here that t(succ(x)) and t(prec(x)) are easy to compute in O(log n) time by respectively returning y
and dparent(z). These operations are used in Algorithms 1, 2 and 3 below.

Operation insert(L, L_1, x) (see Figure 5a) is written using classical deletions and insertions of arcs,
except that each arc deletion is implemented using the dcut dynamic tree operation, whereas each arc
insertion uses the dlink dynamic tree operation. Notice that, since the elements of L_1 are on the arcs of
T(L_1), tail+(L_1) is cut from L_1 before inserting L_1, that is, before appropriately linking it. However,
the values val and index of the former arcs (t(x), dparent(t(x))) in T(L) and (tail(L_1), tail+(L_1)) in
T(L_1) are appropriately recorded on the two new linking arcs (the dlink operation allows it, assuming it
is extended so as to have two cost parameters instead of one, according to Remark 3). Once this is done
by the operation insert_topo(L, L_1, x) (not written here), the val values are in the right order. It remains
to update the index values, so as to ensure that they correctly compute the position of each element
in the list L. This is done as in Algorithm 1. Note that a call to dindex means a call to the dcost dynamic
tree operation, when cost is replaced by index.
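
The index correction of Algorithm 1 can be mimicked on plain Python lists. The sketch below is only a reference model under our own assumptions: the function name insert_with_index_fix and its parameters are hypothetical, positions are 1-based, and each dupdate is replaced by an explicit loop (linear time here, whereas a log-list would use a single O(log n) path update). The data is that of Figure 5a.

```python
def insert_with_index_fix(vals, idx, vals1, idx1, pos):
    """Insert L_1 after the element at 1-based position `pos` of L.

    vals/idx are the values and indices of L, vals1/idx1 those of L_1.
    """
    n0 = idx[pos - 1]     # dindex(t(x)): elements of L up to and including x
    n1 = idx1[-1]         # dindex(tail(L_1)): number of elements in L_1
    # Topological insertion: values are in the right order, indices are not yet.
    new_vals = vals[:pos] + vals1 + vals[pos:]
    new_idx = idx[:pos] + idx1 + idx[pos:]
    # First dupdate: add n0 to the indices of the elements coming from L_1.
    for k in range(pos, pos + len(vals1)):
        new_idx[k] += n0
    # Second dupdate: add n1 to the indices from z = t(succ(x)) to the end.
    for k in range(pos + len(vals1), len(new_vals)):
        new_idx[k] += n1
    return new_vals, new_idx

result = insert_with_index_fix([8, 5, -4, 6], [1, 2, 3, 4], [3, 12], [1, 2], pos=2)
```

With this input, the returned values are 8, 5, 3, 12, -4, 6 and the indices are again 1 to 6.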
Figure 5: Index correction procedures for insert and delete: a) insert(L, L_1, x) for L = {8, 5, -4, 6}, L_1 =
{3, 12} and x = x_2 = 5, using Algorithm 1; b) delete(L, x, y) for L = {8, 5, 3, 12, -4, 6}, x = x_3 = 3 and
y = x_4 = 12, using Algorithm 2. Each arc indicates its val; index pair.


Algorithm 2: delete(L, x, y)

    // Note: we also update the values for the deleted sublist L_1
    H(L) ← log-list representing L;            // assumes tail+(L) is the root of T(L)
    n_0 ← dindex(t(prec(x)));                  // number of elements in L before x
    z ← t(succ(y));
    devert(z); dupdate(t(x), -n_0);            // updates the index for the elements that will go into list L_1
    (H(L), H(L_1)) ← delete_topo(L, x, y);     // assumes pointer to z is still available
    devert(tail+(L));
    dupdate(z, n_0 + 1 - dindex(z));           // updates the indices of the elements between z and tail+(L)
    return H(L), H(L_1);


Operation delete(L, x, y) (see Figure 5b) is written similarly, and outputs the list L as well as the
list L_1 of deleted elements. Notice that a tail tail+(L_1) is added to the subtree deleted from T(L) in
order to form T(L_1), allowing its ingoing arc to receive the val cost equal to y and the corresponding
index value (as they were on the arc (y, dparent(y)) of L). Adding tail+(L_1) and recording the costs
as indicated is done using a dlink operation with a trivial tree containing only tail+(L_1) (we assume that a
sufficient number of such trees, at most n, is present in the forest, in order to allow such completion
operations on log-lists). For the sake of concision, the index correcting Algorithm 2 is written for the most
general case, where no list is empty. We assume that the (omitted) delete_topo operation performs the
topological changes as described. As both lists L and L_1 must be output by the delete algorithm (since
they are parts of the forest), both of them are updated.
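
The index correction of Algorithm 2 can be checked in the same spirit on plain lists. Again this is a hedged reference model, not the log-list implementation: delete_with_index_fix is a hypothetical name, positions are 1-based, the loops stand for the two dupdate calls, and, as in the algorithm, the most general case is assumed (x is not the first element and y is not the last). The data is that of Figure 5b.

```python
def delete_with_index_fix(vals, idx, i, j):
    """Delete the sublist at 1-based positions i..j; return (L, L_1) with corrected indices."""
    idx = list(idx)
    n0 = idx[i - 2]                    # dindex(t(prec(x))): elements before x
    # dupdate(t(x), -n0) with z as root: the future L_1 gets indices 1, 2, ...
    for k in range(i - 1, j):
        idx[k] -= n0
    # Topological deletion.
    kept_vals, kept_idx = vals[:i - 1] + vals[j:], idx[:i - 1] + idx[j:]
    sub_vals, sub_idx = vals[i - 1:j], idx[i - 1:j]
    # dupdate(z, n0 + 1 - dindex(z)): z now sits at position i of the shortened list.
    shift = n0 + 1 - kept_idx[i - 1]
    for k in range(i - 1, len(kept_idx)):
        kept_idx[k] += shift
    return (kept_vals, kept_idx), (sub_vals, sub_idx)

remaining, deleted = delete_with_index_fix([8, 5, 3, 12, -4, 6], [1, 2, 3, 4, 5, 6], 3, 4)
```

Both output lists end up with consecutive indices starting at 1, which is the property the two updates are meant to restore.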

Operation reverse(L, x, y) (see Figure 6) basically consists of the deletion from L of the sublist with
endpoints x and y (which is represented as a log-list and is thus a new tree in the forest),
its reversal using devert(head(L_1)) (where head(L_1) is t(x)), the update of the pointers head, tail and
tail+, and the insertion of the resulting list in L. As before, we assume that these changes are realized
by the omitted operation reverse_topo, and we only show how to update the values of index. This is done
Algorithm 3: reverse(L, x, y)

    // Note: we also update the values for the deleted sublist L_1
    H(L) ← log-list representing L;            // assumes tail+(L) is the root of T(L)
    w ← t(prec(x)); z ← t(succ(y));
    n_0 ← dindex(w); n_1 ← dindex(t(y));       // number of elements in L before x, and up to and including y
    H(L) ← reverse_topo(L, x, y);              // assumes pointers to w, z are still available
    devert(z); u ← dparent(w);
    dminusindex(u);                            // let the indices in the sublist with endpoints y and x be negative but in increasing order
    dupdate(u, n_0 + n_1 + 1);                 // the indices in the sublist with endpoints y and x get the right values
    devert(tail+(L)); return H(L)
Figure 6: Index correction procedure for reverse(L, x, y) with L = {8, 5, 3, 12, -4, 6}, x = x_3 = 3 and
y = x_4 = 12, using Algorithm 3. Each arc indicates its val; index pair.


in Algorithm 3 where, for the sake of concision, only the most general case is presented. The operation
dminusindex is the same as the operation dminuscost when the cost is the function index.
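
The arithmetic behind the index correction for reverse is short enough to verify directly. The sketch below is a list-based model under our own assumptions (hypothetical function name reverse_with_index_fix, 1-based positions, x not first in L): after the topological reversal the indices of the block are in decreasing order, negating them makes them increasing, and adding n_0 + n_1 + 1 brings them back to the range n_0 + 1, ..., n_1. The data is that of Figure 6.

```python
def reverse_with_index_fix(vals, idx, i, j):
    """Reverse the sublist at 1-based positions i..j and repair its indices."""
    n0 = idx[i - 2]       # dindex(w), w = t(prec(x)): elements before x
    n1 = idx[j - 1]       # dindex(t(y)): elements up to and including y
    # Topological reversal: each element keeps its old (now misplaced) index.
    vals = vals[:i - 1] + vals[i - 1:j][::-1] + vals[j:]
    idx = idx[:i - 1] + idx[i - 1:j][::-1] + idx[j:]
    for k in range(i - 1, j):
        idx[k] = -idx[k]              # analogue of dminusindex on the block
    for k in range(i - 1, j):
        idx[k] += n0 + n1 + 1         # analogue of dupdate on the block
    return vals, idx

reversed_list = reverse_with_index_fix([8, 5, 3, 12, -4, 6], [1, 2, 3, 4, 5, 6], 3, 4)
```

An index p in the block becomes n_0 + n_1 + 1 - p, which is exactly the mirror image of p inside the interval [n_0 + 1, n_1].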

Operation find-min(L, x, y) first performs a call to devert(t(succ(y))), in order to ensure that the arcs
of the path going from x to droot(x) (which is now t(succ(y))) record the values val corresponding to
the sublist of L between x and y. Then dminval(t(x)), which is the dynamic tree operation corresponding
to dmincost when the cost is given by the function val, returns the vertex v closest to the root that has
minimum value val(v, dparent(v)).

Operation find-max(L, x, y) applies find-min(L, x, y) once the signs of the elements have been changed
with devert(y) followed by dminusval(x) (the variant of dminuscost for the cost function val), and returns
the opposite of the result.

Operations add(L, x, y, a) and change-sign(L, x, y) are done by simple calls to devert(t(succ(y))) and
then to dupdate(t(x), a) and to dminusval(t(x)), respectively.
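
The intended semantics of these sublist operations, and the reduction of find-max to find-min through a sign change, can be stated as a small reference model. It is only a specification-level sketch under our assumptions: the function names are ours, positions are 1-based and inclusive, the functions return values rather than vertices, and every operation is linear here instead of O(log n).

```python
def find_min(vals, i, j):
    """Minimum value in the sublist at positions i..j."""
    return min(vals[i - 1:j])

def change_sign(vals, i, j):
    """Multiply by -1 every element at positions i..j."""
    return vals[:i - 1] + [-v for v in vals[i - 1:j]] + vals[j:]

def add(vals, i, j, a):
    """Add a to every element at positions i..j."""
    return vals[:i - 1] + [v + a for v in vals[i - 1:j]] + vals[j:]

def find_max(vals, i, j):
    """Maximum obtained as the opposite of the minimum of the negated sublist."""
    return -find_min(change_sign(vals, i, j), i, j)

L = [8, 5, 3, 12, -4, 6]
smallest = find_min(L, 2, 5)
largest = find_max(L, 2, 5)
shifted = add(L, 3, 4, 10)
```

Such a model can serve as an oracle when testing an actual log-list implementation against random sequences of operations.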

Operation find-rank(L, x) only needs to return dindex(t(x)) when T(L) is in standard form, since the
cost function index is correctly updated. Again, dindex is the dcost operation when the cost function is
index.

Operation find-element(L, i) performs devert(tail+(L)) in order to ensure that T(L) is in standard
form, and applies the variant dsearchindex(head(L), i) of dsearchcost when the cost function is index. It
is easy to check that the hypotheses of Claim 1 are verified.

Each list-operation uses a constant-bounded number of dynamic tree operations or computations of
t(succ(x)) and t(prec(x)) (taking O(log n) time, as observed when succ and prec were discussed), so the
running time of each list-operation is bounded by O(log n). □

Remark 6. Note that the traversal of the entire log-list needs O(n log n) time if we use the operations
succ and get-value on each element. However, traversing directly the binary tree associated with the
entire list (in its standard form, and once adexpose(head(L)) has been performed) allows us to visit and
modify all the elements in the list in O(n) time.

4.2 Log-lists with weights 

Now that the underlying data structure of a log-list is fixed, and that the wide range of possibilities
offered by link-cut trees is understood, it is easy to see that we can provide one or several weights for each
element in L and, for each weight function, add to the list-operations above the following weighted
list-operations:

• weight(L, x) which returns the weight of a given element x of L 

• find-min-weight(L, x, y) which returns the minimum weight of an element in the sublist of L defined 
by its first element x and its last element y 

• find-max-weight(L, x, y) which is similar to find-min-weight(L, x, y) but requires the maximum weight

• add-weight(L, x, y, a) which adds a real value a to the weight of each element in the sublist of L 
defined by its first element x and its last element y, and returns the new list L. 

• change-sign-weight(L, x, y) which multiplies by —1 the weight of each element in the sublist of L 
defined by its first element x and its last element y, and returns the new list L. 

• search-weight(L, b) which assumes that L is sorted in increasing order of the weights (which are
assumed distinct), and returns an element of weight b (if it exists) or a pointer to the element with
largest weight lower than b (variant: to the element with smallest weight larger than b)

We easily have: 

Theorem 2. In a log-list with weights, the weighted list-operations above are performed in O(log n)
time.

Proof. The weight of an element x is another cost on the arc (x, dparent(x)) in the initial tree T(L)
for the list L. The dynamic tree operations dcost, dmincost (as well as its variant dmaxcost), dupdate,
dminuscost and dsearchcost applied to this new cost allow us to perform the weighted list-operations. □


5 Applications 


We give below several applications of log-lists to sorting a permutation by reversals and transpositions.
The best algorithms for these tasks are usually very long and complex, and therefore showing that a new
data structure improves their running times is a tedious task. We therefore chose to show: 1) the
improvement realized by log-lists on recent (and moderately long) algorithms involving prefix/suffix
transpositions and reversals (in Subsection 5.2); 2) the evidence that log-lists generalize permutation
trees, which are able to efficiently deal with transpositions but not with reversals (in Subsection 5.3).


5.1 Terminology 

The definitions below are limited to the results we refer to in the remainder of the section, and do not
constitute an exhaustive list of genome rearrangements. The reader is referred to HQ for a detailed survey 
of genome rearrangements, and results on permutation sorting. 
Figure 7: Improvements over existing algorithms, whose O(n^2) running time was the best available before our
implementation. One cell represents a variant of permutation sorting, allowing all the operations indicated in the column
header, in all the versions indicated in the row header. The Source, Result and Best lines respectively give the
reference of the original algorithm, its approximation ratio, and its best current implementation. Note that in five cases
out of six, log-lists easily yield the best implementations, similarly to the implementation we proposed for Algorithm 4.


A genome is represented as a permutation (also termed unsigned permutation) P = (p_1 p_2 ... p_n)
on [n] := {1, 2, ..., n}. In a more precise representation of the genome, the elements may carry a + or
- sign, in which case the permutation is signed. The inverse permutation of P is denoted P^{-1}, whereas
Id denotes the identity permutation.

The transposition tr(P, i, j, k), where 1 ≤ i < j < k ≤ n + 1, is the operation that transforms P
into the following permutation:

    P_0 := (p_1 p_2 ... p_{i-1} p_j p_{j+1} ... p_{k-1} p_i p_{i+1} ... p_{j-1} p_k ... p_n).

In other words, the block of P with endpoints p_i and p_{j-1} is moved between p_{k-1} and p_k. The
reversal rv(P, i, j), where 1 ≤ i < j ≤ n, acts differently on unsigned and signed permutations. When
P is unsigned, the reversal is also unsigned and it transforms P into the following permutation:

    P_1 := (p_1 p_2 ... p_{i-1} p_j p_{j-1} ... p_{i+1} p_i p_{j+1} ... p_n).

Equivalently, the order of the elements in the block of P with endpoints p_i and p_j is reversed. When
P is signed, the reversal is also signed and it transforms P into the following permutation:

    P_1 := (p_1 p_2 ... p_{i-1} -p_j -p_{j-1} ... -p_{i+1} -p_i p_{j+1} ... p_n).

Equivalently, the order of the elements in the block of P with endpoints p_i and p_j is reversed, and
the signs are changed. A transposition tr(P, i, j, k) or a reversal rv(P, i, j) is prefix if i = 1, in which
case they are denoted preftr(P, j, k) and prefrv(P, j) respectively. A suffix transposition or reversal is
defined in a similar way, but involves a suffix of P instead of a prefix.

The block interchange bi(P, i, j, k, l), where 1 ≤ i < j ≤ k < l ≤ n + 1, transforms P into the
following permutation:

    P_2 := (p_1 p_2 ... p_{i-1} \underline{p_k p_{k+1} ... p_{l-1}} p_j ... p_{k-1} \underline{p_i p_{i+1} ... p_{j-1}} p_l ... p_n).

Equivalently, the underlined blocks are switched.
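
These definitions translate directly into list slicing. The following Python sketch is an illustration under our own conventions, not part of the cited algorithms: permutations are Python lists, the positions i, j, k, l are 1-based as in the text, the `signed` keyword is our addition, and each function returns a new list in linear time.

```python
def tr(P, i, j, k):
    """Transposition: the block p_i..p_{j-1} is moved between p_{k-1} and p_k."""
    return P[:i - 1] + P[j - 1:k - 1] + P[i - 1:j - 1] + P[k - 1:]

def rv(P, i, j, signed=False):
    """Reversal of the block p_i..p_j; signs are also flipped when signed=True."""
    block = P[i - 1:j][::-1]
    if signed:
        block = [-p for p in block]
    return P[:i - 1] + block + P[j:]

def bi(P, i, j, k, l):
    """Block interchange: the blocks p_i..p_{j-1} and p_k..p_{l-1} are switched."""
    return P[:i - 1] + P[k - 1:l - 1] + P[j - 1:k - 1] + P[i - 1:j - 1] + P[l - 1:]

def preftr(P, j, k):
    """Prefix transposition, i.e. tr with i = 1."""
    return tr(P, 1, j, k)

def prefrv(P, j):
    """Prefix (unsigned) reversal, i.e. rv with i = 1."""
    return rv(P, 1, j)

P = [1, 2, 3, 4, 5, 6]
moved = tr(P, 2, 4, 6)                    # block (2 3) moved after (4 5)
flipped = rv(P, 2, 4)                     # block (2 3 4) reversed
flipped_signed = rv(P, 2, 4, signed=True)
swapped = bi(P, 1, 3, 4, 6)               # blocks (1 2) and (4 5) switched
```

Note that a transposition is the special case of a block interchange in which the two blocks are adjacent, that is, tr(P, i, j, k) equals bi(P, i, j, j, k).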


5.2 Sorting by Prefix/Suffix Transpositions and Reversals in O(n log n) time

Several algorithms in the literature share similar principles for sorting signed or unsigned permutations
either by transpositions only, or by transpositions and reversals, when all operations are assumed to be
prefix or suffix. We present one of them in detail, which allows us to precisely state the bottlenecks
of such an approach in terms of running time. Then we show how to address these bottlenecks with
log-lists, and we conclude on all the similar variants (given in Figure 7).

In [4], the asymptotic 2-approximation algorithm given in Algorithm 4 is presented for sorting an unsigned
permutation by prefix transpositions and prefix reversals. Its running time is O(n^2). Note that the
element n + 1 is added at the end of the permutation. A strip of P is a sequence p_i, p_{i+1}, ..., p_j, j ≥ i,
of consecutive integers either in increasing order (yielding an increasing strip) or in decreasing order
(yielding a decreasing strip). A singleton is, by definition, both an increasing and a decreasing strip.
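
As an illustration of this definition, the maximal strips of a permutation can be computed by one left-to-right scan. The sketch below makes a few assumptions of ours: the function names strips and first_strip_end are hypothetical, the input is a Python list of distinct integers, and the extra element n + 1 is not appended automatically. Testing only |p_{k+1} - p_k| = 1 is enough, because the direction cannot flip inside a run of distinct integers (a flip would repeat a value).

```python
def strips(P):
    """Split P into its maximal strips, from left to right."""
    out, start = [], 0
    for k in range(1, len(P) + 1):
        # A strip ends at the end of P or where neighbours differ by more than 1.
        if k == len(P) or abs(P[k] - P[k - 1]) != 1:
            out.append(P[start:k])
            start = k
    return out

def first_strip_end(P):
    """1-based position i of the last element of the first strip."""
    i = 1
    while i < len(P) and abs(P[i] - P[i - 1]) == 1:
        i += 1
    return i

example = [3, 4, 5, 2, 1, 6, 8, 7]
example_strips = strips(example)
example_end = first_strip_end(example)
```

Here the permutation splits into an increasing strip, a decreasing strip, a singleton and a second decreasing strip.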

Idea of Algorithm 4. Given the initial set of strips in the permutation, the algorithm performs one
prefix operation at each step, preferring an operation that performs two strip concatenations to an
operation that performs only one concatenation. To this end, the position i at the end of the first strip is
identified (steps 4-5). If the first element in P is 1, it is not immediately useful, so the whole strip is
sent to the end of the permutation (steps 6-7). If the first element p_1 is not 1, then the algorithm attempts
to move a prefix (p_1 p_2 ... p_j) with j ≥ i between p_1 - 1 and its successor (whose positions are a - 1
and a; steps 11-17) or between p_1 + 1 and its successor (whose positions are b - 1 and b; steps 19-25).
Such a move is performed only if it is possible to choose j such that, once the block (p_1 ... p_j) is moved,
two strip concatenations are possible, one at each of its ends. If such a transposition is not possible, then either
p_1 - 1 (steps 27-32) or p_1 + 1 (steps 34-38) allows us to concatenate two strips either by a reversal or
by a transposition, depending on the type of the involved strips.

Our implementation. In our implementation of Algorithm 4, the permutation P is stored as a list L =
{p_1, p_2, ..., p_n} implemented as a log-list. In addition we need, for each element p_i in L, two pointers
b[p_i] and e[p_i], such that b[p_i] (respectively e[p_i]) is not null if and only if p_i is the last (respectively
the first) element in its strip (in the order from first(L) to last(L)). In this case, b[p_i] (respectively e[p_i])
points to the first (respectively last) element of its strip. Clearly, these pointers of the abstract data
structure L may be directly added to the underlying dynamic tree structure T(L). We note that:

1) the operations find-element(L, i) and find-rank(L, x) act similarly to P[i] (= p_i) and P^{-1}[x],
except that they take O(log n) time, that find-element(L, i) returns a pointer instead of a value (an
operation get-value further allows us to obtain this value), and that find-rank(L, x) needs a pointer
t(x) to the source of the arc containing x (see Remark 1). The computation of t(x) may be
easily performed by noticing that the nodes are stored only once, when they are created, and
that the operations on the log-list affect only the information at the nodes, and not the nodes
themselves. We may therefore store a landmark table Lm with entries 1, 2, ..., n and such that Lm[x]
is always a pointer to one of the endpoints of the edge with cost x, initialized as t(x). Any
list-operation modifying the topology of the tree either leaves t(x) unchanged (insert and delete, except
at the boundaries of the inserted/deleted sublist) or puts it on the other endpoint of the same edge
(reverse, except at the boundaries of the reversed sublist). Moreover, at the boundaries the number
of elements concerned is constant (at most four) and thus updating Lm[x] for these elements is
easy. Then t(x) is either Lm[x] or prec(Lm[x]), and may be computed in O(log n) time.

2) prefix transpositions and prefix reversals are performed using, respectively, a delete followed by an
insert operation on L, and a reverse operation. Both assume that the parameters are given as
pointers instead of positions, and need O(log n) time. Updating the strips, i.e. updating b[] and
e[], when concatenations occur at the endpoints of the moved or reversed block is also done in
O(log n) time. Indeed, at most two concatenations occur once a block is moved/reversed and,
for each of them, updating needs only to cross the boundary between the concatenated strips
(with prec and succ) and to follow the old b[] and e[] values in order to compute the new values. A
prefix transposition or prefix reversal, together with the strips update, thus takes O(log n) time.

With these remarks, it seems that we have only succeeded in increasing the running time of important
operations instead of reducing it. However, we are able to show that:

Theorem 3. Algorithm 4 implemented using log-lists runs in O(n log n) time, thus improving the
O(n^2) time needed by the algorithm in [4].

Proof. The initialization of the data structure H(L) is done in O(n) time, since the strips
are identified by a simple traversal of the list. The while loop is executed O(n) times [4]. The
comparison in line 2 is easily done in O(log n) time by testing whether the index of e[first(L)] is n + 1 or
not.
Algorithm 4: Sorting by Prefix Reversals and Prefix Transpositions [4]

Input: Permutation P, number n of elements
Output: Number d of prefix reversals and prefix transpositions needed to sort P

 1  d ← 0;
 2  while P ≠ Id do
 3      i ← 1;
 4      while |p_{i+1} - p_i| = 1 do
 5          i ← i + 1;
 6      if p_1 = 1 then
 7          P ← preftr(P, i + 1, n + 1);
 8      else
            // Try to find a prefix transposition extending the strips at both ends of the moved block
 9          a ← P^{-1}[p_1 - 1] + 1; la ← P^{-1}[p_a - 1] + 1; ra ← P^{-1}[p_a + 1] + 1;
10          b ← P^{-1}[p_1 + 1] + 1; lb ← P^{-1}[p_b - 1] + 1; rb ← P^{-1}[p_b + 1] + 1;
11          if |p_{a-1} - p_a| ≠ 1 then
12              if p_{la} ≠ 1 and |p_{la-1} - p_{la}| ≠ 1 then
13                  if la < a then
14                      P ← preftr(P, la, a);
15              else if p_{ra} ≠ 1 and |p_{ra-1} - p_{ra}| ≠ 1 then
16                  if ra < a then
17                      P ← preftr(P, ra, a);
18          else
19              if |p_{b-1} - p_b| ≠ 1 then
20                  if p_{lb} ≠ 1 and |p_{lb-1} - p_{lb}| ≠ 1 then
21                      if lb < b then
22                          P ← preftr(P, lb, b);
23                  else if p_{rb} ≠ 1 and |p_{rb-1} - p_{rb}| ≠ 1 then
24                      if rb < b then
25                          P ← preftr(P, rb, b);
26              else
                    // Try to find a prefix reversal/transposition extending one strip of the moved block
27                  if p_1 < p_i then
28                      x ← P^{-1}[p_1 - 1];
29                      if p_x = p_{x+1} + 1 then
30                          P ← prefrv(P, x - 1);
31                      else
32                          P ← preftr(P, i + 1, x + 1);
33                  else
34                      y ← P^{-1}[p_1 + 1];
35                      if p_y = p_{y-1} - 1 then
36                          P ← preftr(P, i + 1, y + 1);
37                      else
38                          P ← prefrv(P, y - 1);
39      d ← d + 1;
40  return d;
Steps 4-5 need O(1) time with our implementation, since i is the position of the last element in the
first strip. A pointer iPtr to it is given by iPtr ← e[first(L)].

Steps 6-7 need O(log n) time, as p_1 is obtained with get-value(first(L)) and we have the two pointers
first(L) and iPtr needed by the transposition (as indicated in 2) above).

Steps 9-10 also need O(log n) time, according to 1) above.

Steps 11-17 (and similarly 19-25) are also immediate due to 1) and 2). We only have to call
find-element(la) and find-element(a) before performing the transposition in step 14 (and similarly for
step 17), since in our version of the algorithm we need pointers instead of positions.

The same remarks hold for steps 27-32 (and similarly 34-38).

The maximum running time of a step is thus O(log n), implying that the overall running time is
O(n log n). The theorem is proved. □

In Figure 7, the Unsigned Reversals and Transpositions column records on the first line the result of
Theorem 3. This result may be extended to the two other columns, and both their lines, since all these
four algorithms are based on the same ideas and have the same bottlenecks as Algorithm 4: find the last
element of the first strip, compute P[i] (= p_i) and P^{-1}[x], and perform transpositions and reversals.
Each of these operations is performed in O(log n) time with log-lists, as shown before. We only have to
notice that a signed reversal combines an unsigned reversal and a call to change-sign.

The remaining cell in Figure 7, which concerns sorting by prefix and suffix variants of transpositions
and of unsigned reversals, is a particular case. Here the algorithm is based on a graph representation
of a permutation, requiring at each step to find a convenient edge, and this is not easier with log-lists
than with classical data structures. The same reason explains the absence from Figure 7 of the columns
Signed/Unsigned Reversals (only), for which the implementation with log-lists does not immediately
provide an improvement.

5.3 Replacing permutation trees by log-lists in Sorting by Transpositions 

In [7], Feng and Zhu introduce a new data structure, called permutation trees, and show that it allows
us to improve the running time of the 1.5-approximation algorithm for sorting a permutation by
transpositions [11] from O(n^{3/2} \sqrt{log n}) time to O(n log n) time. Also, the improvement from O(n^2) to
O(n log n) time is achieved in [7] for the exact algorithm in HI for sorting by block interchanges.
Recently, the 1.375-approximation algorithm [6] for sorting by transpositions has also been improved from
O(n^2) time to O(n log n) time in [3], using permutation trees.

Given a permutation (or only a block of it) of size n, a permutation tree stores information about it
in O(n) space and may be computed in O(n) time. Moreover, the following operations are performed
in O(log n) time on permutation trees: find the element at a given position in the permutation, find the
position of a given element of the permutation, join the permutation trees of two neighboring blocks,
split a permutation tree into two permutation trees corresponding to a given decomposition of the
permutation into two blocks, and query a given block of the permutation represented by the tree for
its maximum element.

It is easy to see that log-lists also allow us to perform all these operations in O(log n) time, using
respectively the operations find-element, find-rank, insert, delete and find-max. Then we have:

Theorem 4. Log-lists successfully replace permutation trees in the O(n log n) time implementations
of the 1.5- and 1.375-approximation algorithms [7, 3] for sorting by transpositions, as well as of the
algorithm for sorting by block interchanges f^j. 


6 Conclusion 

The data structure we proposed in this paper is of significant interest when compared to existing data
structures, as it combines the advantages of double-linked lists and those of binary search trees, and
moreover adds some aggregate operations. Like a double-linked list, it allows us to insert/delete/reverse
a sublist by cut and link operations. Like a binary search tree, it allows us to store elements according to
their value (or key) order, and to search for the element with a given value, or with the minimum/maximum
value, using (basically) a binary search. In addition, log-lists allow us to keep track of the rank of each
element in its list, and to search for the element with a given rank.

We proposed here several applications to permutation sorting. They show that the optimal O(n log n)
running time may be achieved for some algorithms whose main challenges are to handle the rank of the
elements in the permutation, to perform a transposition or a reversal, and to merge two parts of the
permutation. These operations are simple using log-lists, as all the difficulties are transferred to a lower
abstraction level, handled using link-cut trees. Link-cut trees already had many of the functions needed
by log-lists, but were still insufficient without the two supplementary operations we added in this paper.

Still, some permutation sorting algorithms use graph representations of the permutation, in which
the search for the best move to perform is difficult due more to the complexity of the graph than to the
data structure. In these cases, log-lists still allow us to perform the transpositions and reversals on the
permutation, but the running time is not easily improved with log-lists. However, as can be seen even
in our applications, log-lists have a lot of operations, and we think that a more intensive and clever use
of aggregate operations could allow further improvements.